RARD: The Related-Article Recommendation Dataset

Authors

  • Jöran Beel
  • Zeljko Carevic
  • Johann Schaible
  • Gábor Neusch
Abstract

Recommender-system datasets are used for offline evaluations, training machine-learning algorithms, and exploring user behavior. While there are many datasets for recommender systems in the domains of movies, books, and music, there are few datasets from research-paper recommender systems. In this paper, we introduce RARD, the Related-Article Recommendation Dataset, from the digital library Sowiport and the recommendation-as-a-service provider Mr. DLib. The dataset contains information about 57.4 million recommendations that were displayed to the users of Sowiport. This information includes which recommendation approaches were used (e.g. content-based filtering, stereotype, most popular), what types of features were used in content-based filtering (simple terms vs. keyphrases), where the features were extracted from (title or abstract), and the times when recommendations were delivered and clicked. In addition, the dataset contains an implicit item-item rating matrix that was created from the recommendation click logs. RARD enables researchers to train machine-learning algorithms for research-paper recommendations, perform offline evaluations, and do research on data from Mr. DLib’s recommender system without implementing a recommender system themselves. In the field of scientific recommender systems, our dataset is unique. To the best of our knowledge, no other available dataset offers as many (implicit) ratings or as many variations of recommendation algorithms. The dataset is available at http://data.mr-dlib.org and is published under the “Creative Commons Attribution 3.0 Unported (CC-BY)” license.
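As a rough illustration of the two uses the abstract mentions (comparing recommendation approaches by clicks, and deriving an implicit item-item rating matrix from click logs), the following minimal sketch works on a tiny in-memory log. The row layout, the approach labels, the document IDs, and both function names are hypothetical and chosen for illustration only; they are not the actual RARD schema, and counting clicks per document pair is just one plausible way to form implicit ratings, not necessarily the authors' method.

```python
from collections import defaultdict

# Hypothetical log rows: (source_doc, recommended_doc, approach, clicked).
# Field layout and values are illustrative only, not the RARD schema.
LOG = [
    ("d1", "d2", "content_based", True),
    ("d1", "d3", "content_based", False),
    ("d1", "d2", "stereotype", True),
    ("d2", "d3", "most_popular", False),
    ("d2", "d1", "content_based", True),
]


def click_through_rate(log):
    """Return clicks divided by delivered recommendations, per approach."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for _src, _rec, approach, was_clicked in log:
        shown[approach] += 1
        if was_clicked:
            clicked[approach] += 1
    return {approach: clicked[approach] / shown[approach] for approach in shown}


def implicit_item_item_matrix(log):
    """Count clicks per (source, recommended) pair as an implicit rating."""
    matrix = defaultdict(lambda: defaultdict(int))
    for src, rec, _approach, was_clicked in log:
        if was_clicked:
            matrix[src][rec] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {src: dict(row) for src, row in matrix.items()}


if __name__ == "__main__":
    print(click_through_rate(LOG))
    print(implicit_item_item_matrix(LOG))
```

With the real dataset, the same two passes would run over the delivered-recommendation records after mapping its actual fields onto the hypothetical tuple layout used here.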


Similar resources

Automatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach

In social networking/microblogging environments, #tags are often used for categorizing messages and marking their key points. Also, since some social networks such as Twitter apply restrictions on the number of characters in messages, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...


Recommendation System for Criminal Behavioral Analysis on Social Network using Genetic Weighted K-Means Clustering

The accessibility and usage of social networking sites constructs both prospects and menaces for the users. In this research article, we propose a new recommendation system for predicting and recommending the criminal behavioral users on social network based upon the activities of the users. Our recommender system uses the proposed nine factor analysis method, clustering technique called Geneti...


A hybrid approach of feature extraction for content-based recommender system

Recommendation systems have gained popularity in the last decades, having a big impact on business models and consumers. This article describes a hybrid approach on feature extraction on a recommender system for business recommendations using a large scale dataset provided by Yelp, one of the most popular review websites. We extract features from the provided dataset using a hybrid technique an...


Assessment of the completeness of Volunteered Geographic Information focusing on building blocks data (Case Study: Tehran metropolis)

OpenStreetMap (OSM) is currently the largest collection of volunteered geographic data, widely used in many projects as an alternative to or integrated with authoritative data. However, the quality of these data has been one of the obstacles to their wide use. In this article, from among the elements related to the quality of volunteered geographic data, we have tried to examine the com...


Collaborative filtering models for recommendations systems

Modern retailers frequently use recommendation systems to suggest products of interest to a collection of consumers. A closely related task is ratings prediction, in which the system predicts a numerical rating that a user u will assign to a product p. In this paper, we build three ratings prediction models for a dataset of products and users from Amazon.com and Yelp.com. We evaluate the streng...



Journal:
  • D-Lib Magazine

Volume 23

Publication date: 2017